
    Deeper-GXX: Deepening Arbitrary GNNs

    Shallow GNNs tend to perform sub-optimally on large-scale graphs or graphs with missing features, so it is necessary to increase the depth (i.e., the number of layers) of GNNs to capture more latent knowledge from the input data. However, adding more layers to GNNs typically degrades their performance due to, e.g., vanishing gradients and oversmoothing. Existing methods (e.g., PairNorm and DropEdge) mainly focus on addressing oversmoothing, but they suffer from drawbacks such as requiring hard-to-acquire knowledge or introducing large training randomness. In addition, these methods simply incorporate ResNet-style residual connections to address vanishing gradients, ignoring an important fact: as more and more layers are stacked in the ResNet architecture, the information collected from faraway neighbors comes to dominate the information collected from 1-hop and 2-hop neighbors, resulting in severe performance degradation. In this paper, we first examine the ResNet architecture and analyze why it is not well suited for deeper GNNs. We then propose a new residual architecture that attenuates the negative impact of ResNet. To address the drawbacks of existing methods, we introduce a Topology-guided Graph Contrastive Loss (TGCL), which uses node topological information and pulls connected node pairs closer via contrastive-learning regularization to obtain discriminative node representations. Combining the new residual architecture with TGCL, we propose an end-to-end framework named Deeper-GXX for deeper GNNs. Extensive experiments on real-world data sets demonstrate the effectiveness and efficiency of Deeper-GXX compared with state-of-the-art baselines.
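    The abstract gives no implementation details, so the following is only a minimal PyTorch-style sketch of a topology-guided contrastive regularizer that pulls connected node pairs closer; the InfoNCE form, the temperature tau, and the use of all other nodes as negatives are assumptions, not details taken from the paper.

        import torch
        import torch.nn.functional as F

        def topology_contrastive_loss(z, edge_index, tau=0.5):
            """Pull embeddings of connected node pairs together (InfoNCE-style sketch).

            z          -- (N, d) node embeddings from the deep GNN
            edge_index -- (2, E) edges; each column (i, j) is a positive pair
            tau        -- temperature (hypothetical default, not from the paper)
            """
            z = F.normalize(z, dim=1)              # compare in cosine-similarity space
            sim = z @ z.t() / tau                  # (N, N) pairwise similarities
            src, dst = edge_index
            # For each edge (i, j): -log( exp(sim_ij) / sum_k exp(sim_ik) )
            log_prob = sim[src] - torch.logsumexp(sim[src], dim=1, keepdim=True)
            return -log_prob[torch.arange(src.size(0)), dst].mean()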

    FairGen: Towards Fair Graph Generation

    Tremendous efforts over the past decades have been dedicated to generating realistic graphs in a variety of domains, ranging from social networks to computer networks, and from gene regulatory networks to online transaction networks. Despite this remarkable success, the vast majority of these works are unsupervised in nature and are typically trained to minimize the expected graph reconstruction loss, which leads to representation disparity in the generated graphs: the protected groups (often minorities) contribute less to the objective and thus suffer systematically higher errors. In this paper, we aim to tailor graph generation to downstream mining tasks by leveraging label information and user-preferred parity constraints. In particular, we start by investigating representation disparity in the context of graph generative models. To mitigate the disparity, we propose a fairness-aware graph generative model named FairGen. Our model jointly trains a label-informed graph generation module and a fair representation learning module by progressively learning the behaviors of the protected and unprotected groups, from the 'easy' concepts to the 'hard' ones. In addition, we propose a generic context sampling strategy for graph generative models, which is proven to fairly capture the contextual information of each group with high probability. Experimental results on seven real-world data sets, including web-based graphs, demonstrate that FairGen (1) obtains performance on par with state-of-the-art graph generative models across six network properties, (2) mitigates representation disparity in the generated graphs, and (3) substantially boosts model performance, by up to 17%, in downstream tasks via data augmentation.
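    As a rough illustration of the representation disparity the paper targets, per-group reconstruction error can be measured as below; this is a hypothetical diagnostic sketch, not FairGen's actual objective (the function name and the squared-error choice are assumptions).

        import torch

        def group_reconstruction_errors(A, A_hat, group_ids):
            """Mean graph reconstruction error per protected group.

            A         -- (N, N) ground-truth adjacency matrix
            A_hat     -- (N, N) reconstructed edge probabilities
            group_ids -- (N,) protected-group label of each node
            """
            node_err = ((A - A_hat) ** 2).mean(dim=1)   # per-node neighborhood error
            return {int(g): node_err[group_ids == g].mean().item()
                    for g in torch.unique(group_ids)}

    A large gap between the returned group errors would indicate that the generator systematically underserves a minority group, which is exactly the disparity FairGen is designed to mitigate.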

    Self-planning Code Generation with Large Language Models

    Although large language models have demonstrated impressive ability in code generation, they still struggle to address the complicated intents provided by humans. It is widely acknowledged that humans typically employ planning to decompose complex problems and schedule solution steps prior to implementation. We therefore introduce planning into code generation to help the model understand complex intent and reduce the difficulty of problem solving. This paper proposes a self-planning code generation method with large language models, which consists of two phases, namely a planning phase and an implementation phase. In the planning phase, the language model plans out the solution steps from the intent via in-context learning. It then enters the implementation phase, where the model generates code step by step, guided by those solution steps. The effectiveness of self-planning code generation has been rigorously evaluated on multiple code generation datasets, and the results demonstrate a marked superiority over naive direct generation with language models. The improvement in performance is substantial, highlighting the significance of self-planning in code generation tasks.
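    The two-phase procedure can be sketched as two chained prompts; `llm` below stands for any text-completion function, and the prompt wording is invented for illustration. In the paper the planning phase also uses in-context learning (few-shot demonstrations), which this sketch omits.

        PLAN_PROMPT = (
            "Decompose the following intent into concise, numbered solution steps.\n"
            "Intent: {intent}\nSteps:"
        )
        CODE_PROMPT = (
            "Write code that fulfills the intent, following the plan step by step.\n"
            "Intent: {intent}\nPlan:\n{plan}\nCode:"
        )

        def self_planning_generate(llm, intent: str) -> str:
            # Phase 1 (planning): the model derives solution steps from the intent.
            plan = llm(PLAN_PROMPT.format(intent=intent))
            # Phase 2 (implementation): code generation guided by those steps.
            return llm(CODE_PROMPT.format(intent=intent, plan=plan))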

    Outlier Impact Characterization for Time Series Data

    For time series data, certain types of outliers are intrinsically more harmful to parameter estimation and future predictions than others, irrespective of their frequency. In this paper, for the first time, we study the characteristics of such outliers through the lens of the influence functional from robust statistics. In particular, we treat the input time series as a contaminated process, with the recurring outliers generated by an unknown contaminating process, and leverage the influence functional to understand the impact of the contaminating process on parameter estimation. The influence functional yields a multi-dimensional vector measuring the sensitivity of the predictive model to the contaminating process, which can be challenging to interpret, especially for models with a large number of parameters. To this end, we further propose a comprehensive single-valued metric (the SIF) to measure outlier impacts on future predictions. It provides a quantitative measure of outlier impact that can be used in a variety of scenarios, such as evaluating outlier detection methods or creating more harmful outliers. Empirical results on multiple real data sets demonstrate the effectiveness of the proposed SIF metric.
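    For reference, the classical influence function from robust statistics, which the paper generalizes, measures the sensitivity of an estimator T at a distribution F to an infinitesimal contamination at a point z:

        \mathrm{IF}(z; T, F) = \lim_{\epsilon \to 0} \frac{T\big((1-\epsilon)F + \epsilon\,\delta_z\big) - T(F)}{\epsilon}

    The abstract replaces the point mass \delta_z with an unknown contaminating process and then collapses the resulting multi-dimensional sensitivity vector into the single-valued SIF; the exact form of that reduction is not given in the abstract.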

    Analysis of the ASMT Gene Family in Pepper (Capsicum annuum L.): Identification, Phylogeny, and Expression Profiles

    Acetylserotonin methyltransferase (ASMT), one of the most important enzymes in plant melatonin biosynthesis, plays a rate-limiting role in melatonin production. In this study, based on the whole genome sequence, we performed a systematic analysis of the ASMT gene family in pepper (Capsicum annuum L.) and analyzed its expression profiles during growth and development as well as under abiotic stress. At least 16 CaASMT genes were identified in the pepper genome. Phylogenetic analysis divided the CaASMTs into three groups (groups I, II, and III) with high bootstrap support. Through the online MEME tool, six distinct motifs (motif 1 to motif 6) were identified. Chromosome mapping showed that most CaASMT genes are located at the distal ends of the pepper chromosomes. In addition, RNA-seq analysis revealed that these CaASMT genes differ in abundance and show distinct expression patterns during vegetative and reproductive development, suggesting distinct functions. qRT-PCR analysis showed high abundance of CaASMT03, CaASMT04, and CaASMT06 in mature green fruit and mature red fruit. Finally, using RNA-seq and qRT-PCR, we found that several CaASMT genes were induced under abiotic stress conditions. These results will not only help elucidate the evolutionary relationships of ASMT genes but also help ascertain their biological functions in the pepper plant's response to abiotic stresses.

    Scalable production of few-layer niobium disulfide nanosheets via electrochemical exfoliation for energy-efficient hydrogen evolution reaction

    Two-dimensional (2D) niobium disulfide (NbS2) features unique physical and chemical properties that make it highly promising for energy conversion applications. Herein, we developed a robust synthesis technique, consisting of electrochemical exfoliation under alternating current followed by liquid-phase exfoliation, to prepare highly uniform few-layer NbS2 nanosheets. The obtained few-layer NbS2 has a 2D nanosheet structure with an ultrathin thickness of ∼3 nm and a lateral size of ∼2 μm. Benefiting from their unique 2D structure and highly exposed active sites, the few-layer NbS2 nanosheets drop-cast on carbon paper exhibited excellent catalytic activity for the hydrogen evolution reaction (HER) in acid, with an overpotential of 90 mV at a current density of 10 mA cm⁻² and a low Tafel slope of 83 mV dec⁻¹, superior to those reported for other NbS2-based HER electrocatalysts. Furthermore, the few-layer NbS2 nanosheets serve as effective bifunctional electrocatalysts for hydrogen production via overall water splitting in which the urea and hydrazine oxidation reactions replace the oxygen evolution reaction.
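    For context, the reported Tafel slope b comes from the Tafel relation between overpotential \eta and current density j,

        \eta = a + b \,\log_{10} j,

    so a slope of 83 mV dec⁻¹ means roughly 83 mV of additional overpotential is needed for each tenfold increase in current density; a lower slope indicates faster HER kinetics.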